In what follows, \(F_{t}\) denotes the cumulative distribution function of the age at death for year \(t\). We set \(\overline{F}_t = 1 - F_t\) and adopt the convention \(F_t(-1)=0\). Henceforth, \(\overline{F}_t\) is called the survival function.
qx
(age-specific) risk of death at age \(x\), or mortality quotient at given age \(x\) for given year \(t\).
About the definition of \(q_{t,x}\)
Defining and computing \(q_{t,x}\) does not boil down to knowing the number of people at age \(x\) at the beginning of year \(t\) and knowing how many of them died during year \(t\). To be rigorous, we need to know all life lines in the Lexis diagram, or equivalently, how many people at age \(x\) were alive on each day of year \(t\).
Mortality quotients define a probability distribution
For a given year \(t\), the sequence of mortality quotients defines a survival function \(\overline{F}_t\) through the following recursion:
\[q_{t,x} = \frac{\overline{F}_t(x-1) - \overline{F}_t(x)}{\overline{F}_t(x-1)}\] with boundary condition \(\overline{F}_t(-1) =1\), that is, \(\overline{F}_t(x) = \overline{F}_t(x-1)\,(1 - q_{t,x})\).
This artificial probability distribution is used to define and compute life expectancies.
\(q_{t,x}\) is the hazard rate of \(\overline{F}_t\) at age \(x\).
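As a minimal sketch of how quotients and survival function determine each other, assume the convention \(\overline{F}_t(x) = \overline{F}_t(x-1)(1 - q_{t,x})\) with \(\overline{F}_t(-1) = 1\), and a quotient vector whose first entry is age \(0\). The function name `survival_from_quotients` and the toy quotients are illustrative, not taken from the data.

```r
# Survival function from mortality quotients, assuming qx[1] is the quotient at age 0
# and the convention F_bar(x) = F_bar(x - 1) * (1 - q_x), with F_bar(-1) = 1.
survival_from_quotients <- function(qx) {
  cumprod(1 - qx)
}

# Toy quotients for ages 0, 1, 2 (made-up values; the last age closes the table).
qx <- c(0.5, 0.5, 1)
surv <- survival_from_quotients(qx)

# Going back: q_x = (F_bar(x - 1) - F_bar(x)) / F_bar(x - 1).
surv_prev <- c(1, head(surv, -1))
q_back <- (surv_prev - surv) / surv_prev
```

Here `surv` holds \(\overline{F}_t(0), \overline{F}_t(1), \overline{F}_t(2)\), and `q_back` recovers the original quotients, which illustrates that the two descriptions carry the same information.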
ex:
Residual Life Expectancy at age \(x\) and year \(t\)
This is the expectation of \(X -x\) for a random variable \(X\) distributed according to \(\overline{F}_t\), conditionally on the event \(\{ X \geq x \}\). That is, \(e_{t,x}\) is the expectation of the probability distribution defined by \(\overline{F}_t(\cdot + x-1)/\overline{F}_t(x-1)\).
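Assuming \(X\) takes integer values (age in completed years) and \(\overline{F}_t(y) = P(X > y)\), this conditional expectation can be expanded as a sum of conditional survival probabilities:
\[
e_{t,x} = \mathbb{E}\left[X - x \mid X \geq x\right] = \sum_{k=0}^{\infty} P\left(X - x > k \mid X \geq x\right) = \sum_{y=x}^{\infty} \frac{\overline{F}_t(y)}{\overline{F}_t(x-1)}
\]
For \(x = 0\), since \(\overline{F}_t(-1) = 1\), this gives \(e_{t,0} = \sum_{y \geq 0} \overline{F}_t(y)\).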
Rearrangement
Question
From the dataframe life_table, compute another dataframe called life_table_pivot with primary key Country, Gender and Year, and one column for each Age from 0 up to 110. In each age column, the entry should be the central death rate at the age given by the column name, for the Country, Gender and Year identifying the row.
You may use the functions pivot_wider and pivot_longer from the tidyr package.
The resulting schema should look like:
| Column Name | Type |
|---|---|
| Country | factor |
| Gender | factor |
| Year | integer |
| 0 | double |
| 1 | double |
| 2 | double |
| 3 | double |
| \(\vdots\) | \(\vdots\) |
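One possible way to obtain such a schema is sketched here on a toy table. It assumes that life_table has columns Country, Gender, Year, Age and a central death rate column, hypothetically named `mx`; the values are made up for illustration.

```r
library(tidyr)

# Toy stand-in for life_table; the column name `mx` (central death rate)
# and all values are assumptions made for this example.
life_table <- data.frame(
  Country = factor(c("A", "A", "A", "A")),
  Gender  = factor(c("Female", "Female", "Female", "Female")),
  Year    = c(2000L, 2000L, 2001L, 2001L),
  Age     = c(0L, 1L, 0L, 1L),
  mx      = c(0.01, 0.001, 0.009, 0.0009)
)

# One row per (Country, Gender, Year), one column per Age.
life_table_pivot <- pivot_wider(
  life_table,
  id_cols     = c(Country, Gender, Year),
  names_from  = Age,
  values_from = mx
)
```

The age columns are named after the values of Age (`"0"`, `"1"`, ...), so they have to be referred to with backticks or by string when used later.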
Question
Using life_table_pivot, compute life expectancy at birth for each Country, Gender and Year using the formula
\[e_{t,0} = \sum_{x=0}^\infty \overline{F}_t(x)\]
Life expectancy and window functions
Question
Write a function that takes as input a vector of mortality quotients, as well as an age, and returns the residual life expectancy corresponding to the vector and the given age.
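A minimal sketch of one possible answer, assuming the quotient vector starts at age \(0\) and using \(e_{t,x} = \sum_{y \geq x} \overline{F}_t(y)/\overline{F}_t(x-1)\), where each ratio equals \(\prod_{z=x}^{y}(1 - q_{t,z})\). This counts completed years only, with no half-year correction. The function name `residual_life_expectancy` is illustrative.

```r
# Residual life expectancy at a given age from a vector of mortality quotients.
# Assumes qx[1] is the quotient at age 0, so age x sits at index x + 1.
residual_life_expectancy <- function(qx, age) {
  stopifnot(age >= 0, age < length(qx))
  tail_q <- qx[(age + 1):length(qx)]
  # cumprod gives the survival probabilities conditional on reaching `age`;
  # their sum is the expected number of further completed years.
  sum(cumprod(1 - tail_q))
}

# Toy quotients for ages 0, 1, 2 (made-up values).
qx <- c(0.5, 0.5, 1)
e0 <- residual_life_expectancy(qx, 0)
e1 <- residual_life_expectancy(qx, 1)
```

With these toy quotients the survival probabilities from birth are 0.5, 0.25 and 0, so `e0` is their sum.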
Question
Write a function that takes as input a dataframe with the same schema as life_table and returns a dataframe with columns Country, Gender, Year, Age defining a primary key and a column res_lex containing the residual life expectancy corresponding to the primary key.
In order to compute residual life expectancies, you may consider using window functions over appropriately defined windows. The next window function suffices to compute life expectancy at birth. It computes the logarithm of survival probabilities for each Country, Year, Gender (partition) at each Age. Note that the expression mentions an aggregation function sum and that the correctness of the result is ensured by a correct design of the frame argument.
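As a hedged, in-memory analogue of such a window function, the sketch below uses dplyr on a toy table. It assumes a quotient column named `qx` (an assumption about the schema) and replaces the explicit frame by `cumsum()`, which behaves like a sum over the frame running from the start of the partition to the current row.

```r
library(dplyr)

# Toy table with made-up quotients; the column name `qx` is an assumption.
toy_life_table <- data.frame(
  Country = "A",
  Gender  = "F",
  Year    = rep(c(2000L, 2001L), each = 3),
  Age     = rep(0:2, times = 2),
  qx      = c(0.5, 0.5, 1, 0.2, 0.5, 1)
)

log_survival <- toy_life_table |>
  group_by(Country, Year, Gender) |>       # partition
  arrange(Age, .by_group = TRUE) |>        # ordering inside each partition
  mutate(log_surv = cumsum(log1p(-qx))) |> # running sum of log(1 - qx)
  ungroup()

# Life expectancy at birth: sum of survival probabilities within each partition.
lex_at_birth <- log_survival |>
  group_by(Country, Year, Gender) |>
  summarise(e0 = sum(exp(log_surv)), .groups = "drop")
```

When the last quotient equals 1, the log survival is `-Inf` and its exponential is 0, so the closing age contributes nothing to the sum.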
Question
Compute residual life expectancies at all ages using window functions.
Computing residual life expectancies using window functions and accumulate
The official calculation of residual life expectancies assumes that, except at age \(0\) and at very old ages, people die uniformly at random between age \(x\) and \(x+1\): \[
e_{t,x} = (1- q_{t,x}) \times (1 + e_{t,x+1}) + \frac{1}{2} \times q_{t,x}
\]
This recursion suggests a more efficient way to compute residual life expectancies at all ages.
Indeed, purrr::accumulate() makes it possible to compute all values of \(e_{t,x}\) in exactly one pass over the table.
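A minimal sketch of this idea for a single quotient vector, assuming ages start at \(0\), that the recursion is applied at every age (ignoring the special treatment of age \(0\) and of very old ages), and that the residual life expectancy beyond the last tabulated age is \(0\). The function name `residual_lex_all_ages` is illustrative.

```r
library(purrr)

# Residual life expectancies at all ages in one backward pass.
# Assumptions: qx[1] is the quotient at age 0, the recursion holds at every age,
# and life expectancy beyond the last tabulated age is 0.
residual_lex_all_ages <- function(qx) {
  # With .dir = "backward", the function receives the next value first
  # and the accumulated value second.
  step <- function(q, e_next) (1 - q) * (1 + e_next) + 0.5 * q
  res <- accumulate(qx, step, .init = 0, .dir = "backward")
  head(res, -1)  # drop the initial value, which sits past the last age
}

# Toy quotients for ages 0, 1, 2 (made-up values).
res_lex <- residual_lex_all_ages(c(0.5, 0.5, 1))
```

Working backwards on the toy vector: at age 2 the recursion gives 0.5, at age 1 it gives 0.5 × 1.5 + 0.25 = 1, and at age 0 it gives 0.5 × 2 + 0.25 = 1.25. Applied per Country, Gender and Year, the same function yields a column of residual life expectancies.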